
docs(api): enhance api descriptions for AI Agents #35382

Merged
rjvelazco merged 23 commits into main from issue-35381-improve-openapi-endpoint-descriptions-based-on-mcp-testing-feedback
May 12, 2026

@rjvelazco rjvelazco commented Apr 20, 2026

Summary

Improves OpenAPI endpoint descriptions based on findings from MCP integration testing with Claude Code. Documentation-only — no functional behavior changes. All edits live in Java @Operation / @Parameter / @RequestBody annotations; openapi.yaml is regenerated from those by the Swagger Maven plugin.

Closes #35381.

Endpoints touched

NavResource: GET /api/v1/nav/{uri}

  • Clarified that uri must resolve to a folder; page URIs return 404.
  • Documented / as the way to fetch the site-root tree.
  • Rewrote depth semantics (level-1 = starting node, < 1 coerced to 1, exceeding actual depth returns full subtree).
  • Corrected languageId description: tags nodes but does not translate folder names.
  • Typed depth / languageId as integers.
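
The depth rules listed above can be modeled on the client side. This is a minimal sketch of the documented semantics as read from this summary, not the server implementation; the node shape (`title`, `children`) and the function name `trim_to_depth` are assumptions made for illustration.

```python
from typing import Any, Dict

Node = Dict[str, Any]


def trim_to_depth(node: Node, depth: int) -> Node:
    """Return a copy of `node` limited to `depth` levels.

    Level 1 is the starting node itself, values below 1 are coerced to 1,
    and a depth larger than the tree returns the full subtree.
    """
    depth = max(depth, 1)  # documented coercion: depth < 1 behaves like 1
    trimmed = {k: v for k, v in node.items() if k != "children"}
    if depth > 1:
        trimmed["children"] = [
            trim_to_depth(child, depth - 1) for child in node.get("children", [])
        ]
    else:
        trimmed["children"] = []  # level 1: the starting node only
    return trimmed


# Hypothetical folder tree: root -> docs -> api
tree = {"title": "root", "children": [
    {"title": "docs", "children": [{"title": "api", "children": []}]},
]}
```

Under these assumptions, `trim_to_depth(tree, 0)` and `trim_to_depth(tree, 1)` give the same result, and an oversized depth returns the whole tree.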

ContentTypeResource

  • POST /api/v1/contenttype: replaced the wrapper-style body description with the real contract: required clazz / name, enumerated valid clazz values, optional host/folder/workflow/fields/metadata/systemActionMappings, the singular-vs-plural workflow ↔ workflows round-trip pitfall, field-object schema (incl. boolean-via-radio pattern), Row/Column layout encoding.
  • PUT /api/v1/contenttype/id/{idOrVar} — elevated the "destructive PUT" warning, documented the GET → mutate → PUT pattern with the workflows → workflow rename, and added a warning that systemActionMappings is validated on write.
  • DELETE /api/v1/contenttype/id/{idOrVar} — noted that entity is a JSON-encoded string, requiring JSON.parse.
  • POST /api/v1/contenttype/_filter — added comparison to GET /api/v1/contenttype?filter=.
  • POST /api/v1/contenttype/{variable}/_copy — documented what the copy preserves.
  • GET /api/v1/contenttype/id/{idOrVar} — explained how to inspect fields[].clazz to choose between ImmutableBinaryField and ImmutableImageField.
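
The GET → mutate → PUT pattern with the workflows → workflow rename could look roughly like the sketch below. The value shapes are assumptions not spelled out in this PR: the GET entity is assumed to carry `workflows` as objects with an `id`, and the write side is assumed to expect `workflow` as a list of scheme ids. The helper name `to_put_body` and the sample entity are hypothetical.

```python
import copy
from typing import Any, Dict


def to_put_body(fetched: Dict[str, Any], **changes: Any) -> Dict[str, Any]:
    """Build a PUT body from a GET entity without dropping workflow schemes."""
    body = copy.deepcopy(fetched)
    # Assumed shapes: GET returns `workflows` (objects with an `id`);
    # the write side expects `workflow` (a list of scheme ids).
    workflows = body.pop("workflows", [])
    body["workflow"] = [w["id"] for w in workflows]
    body.update(changes)  # change only the fields you intend to change
    return body


# Hypothetical GET entity, trimmed to the fields relevant here.
fetched = {
    "clazz": "com.dotcms.contenttype.model.type.ImmutableSimpleContentType",
    "name": "Article",
    "workflows": [{"id": "scheme-1", "name": "System Workflow"}],
    "fields": [],
}
body = to_put_body(fetched, name="News Article")
```

Starting from the full GET entity matters because the PUT is described as destructive: anything left out of the body is at risk of being removed.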

TempFileResource: POST /api/v1/temp and POST /api/v1/temp/byUrl

  • Documented the canonical workflow: upload here → reuse tempFiles[0].id as the field value when firing a workflow action.
  • Added example response bodies.
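
The upload-then-fire workflow can be sketched as a small helper that turns an upload response into a fire-action body. Only `tempFiles[0].id` and the `contentlet` envelope come from this PR; the function name, the `contentType` key, and the sample values are assumptions for illustration.

```python
from typing import Any, Dict


def fire_body_with_temp_file(
    temp_response: Dict[str, Any], field_var: str, contentlet: Dict[str, Any]
) -> Dict[str, Any]:
    """Use the first uploaded temp file id as the value of a binary field."""
    temp_files = temp_response.get("tempFiles", [])
    if not temp_files:
        raise ValueError("upload response contains no tempFiles")
    merged = dict(contentlet)
    merged[field_var] = temp_files[0]["id"]  # e.g. "temp_<id>"
    return {"contentlet": merged}


# Hypothetical upload response; only `id` is used here.
upload = {"tempFiles": [{"id": "temp_abc123", "fileName": "logo.png"}]}
body = fire_body_with_temp_file(
    upload, "image", {"contentType": "Article", "title": "Hello"}
)
```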

WorkflowResource

  • PUT /api/v1/workflow/actions/default/fire/{systemAction} — full rewrite covering: request-body envelope, structured validation-error shape and known errorCode values, unknown fields silently dropped, Radio/Select/Checkbox values not validated, binary/image field workflows (Pattern A: single-use temp_<id>; Pattern B: reusable dotAsset for ImmutableImageField only), ImmutableBinaryField vs ImmutableImageField acceptance table, PUBLISH-on-archived warning, multi-scheme caveat, and indexPolicy=WAIT_FOR guidance.
  • PUT /api/v1/workflow/actions/{actionId}/fire — documented Move action body shape (pathToMove is a sibling of contentlet, not nested) and the Copy action (default System Workflow id, response returns source not copy, destination hints silently ignored).
  • Fire variants (by name / by name multipart / by id / by id multipart): indexPolicy documented as a typed [DEFER, WAIT_FOR, FORCE] enum with per-value descriptions.
  • POST .../actions/_bulkfire and PUT .../actions/bulk/fire — documented that contentletIds expects inodes (not identifiers), that additionalParams: {} is required, that the endpoint is not step-aware, and that the two paths are aliases differing only in transport.
  • GET /api/v1/workflow/contentlet/{inode}/actions — documented the empty-array-on-terminal-step ambiguity.
  • POST /api/v1/workflow/actions — documented that supplying stepId also attaches in one call, and the lack of an "anyone" constant.
  • POST /api/v1/workflow/steps — documented that escalationAction must be "" when enableEscalation: false.
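
The bulk-fire notes above translate into a body builder along these lines. `contentletIds` and `additionalParams` are taken from the description; the `workflowActionId` key and the helper name are assumptions that should be checked against the generated openapi.yaml.

```python
from typing import Any, Dict, List


def bulk_fire_body(action_id: str, contentlets: List[Dict[str, str]]) -> Dict[str, Any]:
    """Build a bulk-fire body from contentlets that carry both ids."""
    return {
        "workflowActionId": action_id,  # assumed key name, not quoted in this PR
        # The endpoint expects inodes, not identifiers.
        "contentletIds": [c["inode"] for c in contentlets],
        "additionalParams": {},  # documented as required, even when empty
    }


items = [
    {"identifier": "id-1", "inode": "inode-1"},
    {"identifier": "id-2", "inode": "inode-2"},
]
body = bulk_fire_body("action-123", items)
```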

Out of scope

Server-behavior proposals from the feedback doc (add COPY as a SystemAction, populate actionInputs[].body, return the new copy on Copy, validate radio/select values, fix the multipart 500-on-missing-file, eager multi-scheme init) were intentionally not addressed — this PR is documentation-only.

🤖 Generated with Claude Code

This PR fixes: #35381


claude Bot commented Apr 20, 2026

Claude finished @rjvelazco's task in 3m 26s


Review — latest changes

The follow-up commits address the highest-leverage items from the previous AI-optimization pass: TempFilesView is typed, the recurring index-policy + bulk-fire warnings are extracted to constants, the fire-action 400 carries a real @ExampleObject, and MultiPartUtilsTest has the Given/Expected javadocs Dario asked for. The remaining items below are minor; none block merge.

What's now resolved

  • INDEX_POLICY_CHAINING_NOTE extracted to a single constant and reused in 6 operations — no more verbatim drift risk (WorkflowResource.java:219).
  • BULK_FIRE_CONTRACT_NOTES extracted and shared between PUT and POST bulk-fire descriptions (WorkflowResource.java:226).
  • TempFilesView is now final, defensively copies the list, and is referenced by @Schema(implementation = TempFilesView.class) on both temp endpoints — the openapi.yaml now emits a real TempFilesView component with tempFiles → DotTempFile[] instead of type: object.
  • 400 on putFireActionDefaultByName now has a structured @ExampleObject consumable from the standard OpenAPI path (responses['400'].content) — agents no longer have to NLP-parse the description (WorkflowResource.java:3026).
  • New MultiPartUtilsTest has proper Given/Expected javadocs per Dario's request and covers both null and empty-list paths.

Worth a second look

1. MultiPartUtils null-guard has broad blast radius beyond /api/v1/temp. The change at MultiPartUtils.java:112 affects ~9 callers (JsResource, OSGIResource, TagsResourceHelper, WorkflowResource, PushPublishFilterResource, AppsHelper, VTLResource, the OSGI v1 one). Anywhere those callers previously caught NPE via a global handler and surfaced "missing file" indirectly will now silently receive Collections.emptyList() and need to check the list themselves. Quick spot-check of PushPublishFilterResource.java:456 and AppsHelper.java:810 shows neither validates the list before reading element 0 — so a missing file part will now produce a different downstream NPE/IOOBE instead of the original one. Not strictly a regression (still fails), but the failure mode shifts. Either:

  • Confirm callers already validate, or
  • Have getBinariesFromMultipart return an empty list only for the temp-file path that legitimately tolerates it, and keep the contract elsewhere.

The simplest safe move: leave the guard as-is and add a Logger.debug so the silent-empty case is at least observable.
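
To make the shifted failure mode concrete, here is a language-neutral model of the guard, written in Python rather than Java. The function names and the dictionary standing in for the multipart form are hypothetical; the sketch only illustrates why callers that read element 0 need their own emptiness check once a missing part yields an empty list.

```python
from typing import Dict, List, Optional


def get_binaries(fields: Dict[str, Optional[List[bytes]]], name: str) -> List[bytes]:
    """Model of the guarded behavior: a missing part yields an empty list."""
    parts = fields.get(name)
    return [] if parts is None else list(parts)


def first_binary(fields: Dict[str, Optional[List[bytes]]], name: str) -> bytes:
    """Caller-side check that keeps the 'missing file' failure explicit."""
    binaries = get_binaries(fields, name)
    if not binaries:
        raise ValueError(f"missing file part: {name}")
    return binaries[0]
```

Without the check in `first_binary`, the caller would fail with an index error on `binaries[0]`, which is the analogue of the downstream exception described above.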

2. Scope creep in a docs(api) PR. The PR title and Closes #35381 framing remain "documentation-only", but this revision now ships a real defensive bugfix (MultiPartUtils NPE → empty list) plus a new value class (TempFilesView). Both are good changes; the concern is rollback granularity — if either of them needs to be reverted, the docs go with them. Consider either splitting into a follow-up fix(api): … PR, or amending the PR body so reviewers see the non-doc surface up front.

3. Inconsistent Java-source indentation in putContentTypeUpdate description. Continuation lines at ContentTypeResource.java:658-670 are flush-left to the description = token, while the corresponding createType block at :489-510 aligns continuations to the column after the opening quote. Stylistic only; consistency makes diffs cleaner.

4. Hardcoded Copy action UUID in prose. WorkflowResource.java:4359 and :3017 embed 963f6a04-5320-42e7-ab74-6d876d199946 as "the Copy action id on the default System Workflow scheme." The text does point users at GET /api/v1/workflow/schemes/{schemeId}/actions to look it up, which is fine — but worth verifying this UUID is actually the seeded value across the dotCMS install base and not just one dev environment. If it varies, the example becomes misleading.

5. TempFilesView is unused as an actual return type. Both endpoints still build the response by hand: MultipleBinaryStreamingOutput (streaming JSON for the multipart endpoint) and ImmutableMap.of("tempFiles", tempFiles) for byUrl. So TempFilesView is a documentation-only contract — fine, common pattern — but worth a @VisibleForTesting-style comment so a future reader doesn't try to refactor the endpoints to return it directly and accidentally drop the streaming behavior. Optional.

6. MultiPartUtilsTest runs against mocks, not the real Jersey internals. The whole premise of the bugfix is "Jersey's FormDataMultiPart.getFields(name) returns null when no part matches" — but the test never instantiates a real FormDataMultiPart; it mocks the method to return null directly. So the test guards against the regression but doesn't prove that real Jersey behaves as the test description claims. Consider one additional test (or comment with a Jersey version reference) that documents why this null is possible. Minor.

Still open from prior review (acknowledged, intentionally deferred)

  • Enums in prose (clazz, dataType, systemAction) not yet promoted to @Schema(allowableValues = …).
  • TempFilesView not wrapped in a ResponseEntityView envelope — author confirmed the runtime returns the bare {tempFiles: [...]} shape and 5 clients depend on it. Follow-up PR will normalize.
  • Non-ASCII glyphs (⚠️, arrows, and the yes/no table-cell markers) still in descriptions. Acceptable; flagging only for MCPs that token-budget by character.
  • Markdown table separator inconsistency within WorkflowResource (|-|-|-| vs |---|---|---|).

Verdict

Clean. The structural improvements are real and the new defensive code is reasonable. The blast-radius question on MultiPartUtils (#1) is the only thing I'd want a positive answer on before merging.


  • Inspect new commits since previous review
  • Review TempFilesView immutability change
  • Review MultiPartUtils null guard + new test
  • Check whether prior AI-optimization feedback was actioned
  • Post review

@rjvelazco rjvelazco marked this pull request as draft April 20, 2026 15:39
rjvelazco and others added 8 commits April 20, 2026 18:43
- Remove **Required.** from `variable` in copyType request body (only `name` is required)
- Move `variable`, `host`, `folder` from Required to Optional in createType (server-side defaults exist)
- Fully qualify all `clazz` names with `com.dotcms.contenttype.model.type.` prefix
- Fix `depth`/`languageId` schema format: int64 → int32 in openapi.yaml
- Remove PUBLISH-on-archived bug note from workflow fire endpoint (belongs in issue tracker, not API spec)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions Bot added the Area : Documentation label Apr 21, 2026
@rjvelazco rjvelazco changed the base branch from main to mcp-server-2 May 12, 2026 15:08
@rjvelazco rjvelazco changed the base branch from mcp-server-2 to main May 12, 2026 15:08
@rjvelazco rjvelazco marked this pull request as ready for review May 12, 2026 15:08
rjvelazco added 3 commits May 12, 2026 11:27
…-on-mcp-testing-feedback' of https://github.com/dotCMS/core into issue-35381-improve-openapi-endpoint-descriptions-based-on-mcp-testing-feedback
@rjvelazco

@claude review these changes but focus on how the docs are/can be optimized for AI


claude Bot commented May 12, 2026

Claude finished @rjvelazco's task in 2m 52s


Review — AI-Optimization Focus

This is a strong documentation pass: the new prose captures most of the non-obvious contract details that bit you in MCP testing (binary vs image acceptance, workflows ↔ workflow rename, destructive PUT, inode-not-id in bulk-fire, indexPolicy timing). For human Swagger UI readers this is great. Below I focus only on how an AI consumer (MCP, code-gen client, OpenAPI-to-tool bridge) experiences the same content.

What's working well for AI consumers

  • Explicit "Pattern A / Pattern B" blocks under fire/{systemAction} — AI agents can copy-paste these flows.
  • Acceptance tables (ImmutableBinaryField vs ImmutableImageField accepts temp_<id> / dotAsset id) — exactly the shape an agent needs.
  • Cross-endpoint pointers ("call GET /api/v1/contenttype/id/{idOrVar} and inspect fields[].clazz") give a deterministic next step instead of trial-and-error.
  • Misleading-error callouts (titel → "title is required", Workflow-does-not-exists-content-type translation key) prevent an agent from getting stuck retrying the same broken input.
  • errorCode value enumeration (required, unknown) for the fire-action validation response.
  • Bulk-fire silent-noop warning is the single highest-value addition — an agent reading the response counts alone would never recover from passing identifiers.

Optimizations / issues for AI consumption

1. Repeated indexPolicy=WAIT_FOR paragraph is duplicated verbatim in ~6 operations (putFireActionByNameMultipart, putFireActionByName, putFireActionDefaultByName, putFireActionMultipart, putFireAction, fire/{actionId}). Each copy costs an LLM the same tokens again and adds no new information. Consider extracting it to a shared constant or referencing a shared component (externalDocs.url or a top-level info.x-dotcms-guides.indexPolicy). Same for the contentletIds "expects inodes, not identifiers" warning, repeated three times across the bulk-fire variants.

2. Enums live in prose, not in schema.enum. AI agents discover acceptable values by inspecting schema.enum, not by NLP over a description paragraph. Several high-value enumerations are still narrative-only:

  • ContentTypeResource.createType — the 9 clazz values (ImmutableSimpleContentType, ImmutableWidgetContentType, …) live in @RequestBody.description text.
  • Field clazz values (ImmutableTextField, ImmutableBinaryField, ImmutableImageField, …).
  • dataType (TEXT, LONG_TEXT, SYSTEM, BOOL, INTEGER, FLOAT, DATE).
  • systemAction keys (NEW, EDIT, PUBLISH, UNPUBLISH, ARCHIVE, UNARCHIVE, DELETE, DESTROY) — these are already a SystemAction enum server-side, but @Schema on the path param doesn't expose them.

Promoting these to @Schema(allowableValues = …) (where the value is a field) or enum: on a typed sub-schema would convert "AI must read English" into "AI sees a structured list." indexPolicy is the right model — it already has allowableValues = {"DEFER", "WAIT_FOR", "FORCE"}.
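
The difference is easy to see from the consumer side. The sketch below uses two hand-written parameter objects in OpenAPI shape; the `indexPolicy` values come from this review, while the parameter locations, the description text, and the helper name are assumptions for illustration.

```python
from typing import Any, Dict, List


def allowed_values(param: Dict[str, Any]) -> List[str]:
    """Return the machine-readable values, or [] when they live only in prose."""
    return list(param.get("schema", {}).get("enum", []))


# indexPolicy as the review describes it: a typed enum.
index_policy = {
    "name": "indexPolicy",
    "in": "query",
    "schema": {"type": "string", "enum": ["DEFER", "WAIT_FOR", "FORCE"]},
}
# A prose-only parameter: the values appear in the description text only.
system_action = {
    "name": "systemAction",
    "in": "path",
    "description": "One of NEW, EDIT, PUBLISH, ...",
    "schema": {"type": "string"},
}
```

A tool bridge calling `allowed_values(system_action)` gets an empty list and has to fall back to reading the description, which is the gap this item points at.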

3. depth / languageId schema-vs-impl divergence. The new @Schema(type = "integer") is correct for AI clients, but the Java params are still final String. A strict generated client will send 12 as a JSON number; JAX-RS receives it as the string "12" and Integer.parseInt still works — so it's safe today. But if you ever switch the parameter to a typed Integer, the existing string clients break. Worth flagging in the description that the param is also accepted as a numeric string, or retype the Java param to Integer to make the contract honest in both directions. (NavResource.java:118, NavResource.java:124)
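
A quick way to see why this is safe today, assuming depth and languageId travel as query parameters: an integer and its numeric string produce the same bytes on the wire. The values below are arbitrary.

```python
from urllib.parse import urlencode

# A typed client passes integers; an untyped one passes strings.
typed = urlencode({"depth": 2, "languageId": 1})
untyped = urlencode({"depth": "2", "languageId": "1"})
```

Both calls yield the same query string, so the server-side string parameter cannot tell the two clients apart.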

4. The entity-is-a-string for DELETE /contenttype/id/{idOrVar} is documented in prose only. The @ApiResponse(200).content.schema still points at the same generic response view as siblings. An AI agent that introspects responses.200.content.application/json.schema will assume entity is an object. Better: introduce a small ResponseEntityDeletedContentTypeView whose entity property is type: string, so the schema itself reveals the quirk without the agent having to read the description.
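
Until the schema reveals the quirk, a client has to decode twice. A minimal sketch follows; the payload inside `entity` is hypothetical, since the real content is not shown in this PR.

```python
import json
from typing import Any


def parse_delete_entity(response_text: str) -> Any:
    """Decode the response, then decode `entity` again because it is a string."""
    envelope = json.loads(response_text)
    entity = envelope["entity"]
    # Documented quirk: `entity` arrives as a JSON-encoded string.
    return json.loads(entity) if isinstance(entity, str) else entity


# Hypothetical response body with a string-encoded entity.
raw = json.dumps({"entity": json.dumps({"deleted": "Article"})})
```

The `isinstance` check keeps the helper working if the endpoint later returns a real object.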

5. @Schema(type = "object") in TempFileResource for both upload responses (uploadTempResourceMulti and copyTempFromUrl). The example payload is excellent, but type: object with no properties / additionalProperties means SDK generators emit Map<String, Object> and agents lose the shape of tempFiles[]. The CLAUDE.md in dotCMS/src/main/java/com/dotcms/rest/ explicitly forbids this pattern in favor of a named view class — recommend introducing ResponseEntityTempFilesView with a TempFileItem schema (id, fileName, length, mimeType, image, referenceUrl, metadata).

6. Move/Copy body shape is prose-only. The most error-prone fact in the PR — pathToMove must be a sibling of contentlet, not nested — is described in markdown but not modeled. An agent could pattern-match it incorrectly. A FireMoveRequest / FireCopyRequest schema with the correct nesting (oneOf on FireContentletRequest) would let the agent's type-checker fail fast on a wrong-shape body, rather than failing at runtime with the cryptic "The host path is not valid: null".
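
Short of a modeled schema, a client-side builder can enforce the nesting rule. Only the sibling relationship between `pathToMove` and `contentlet` comes from the PR; the helper name, the identifier, and the destination path are hypothetical.

```python
from typing import Any, Dict


def move_body(contentlet: Dict[str, Any], path_to_move: str) -> Dict[str, Any]:
    """Build a Move body with `pathToMove` next to `contentlet`, not inside it."""
    if "pathToMove" in contentlet:
        raise ValueError("pathToMove must be a sibling of contentlet, not nested")
    return {"contentlet": dict(contentlet), "pathToMove": path_to_move}


# Hypothetical identifier and destination path.
body = move_body({"identifier": "abc"}, "//demo.dotcms.com/archive/")
```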

7. Mismatch between the createType description ("body is the content-type object directly, not wrapped in a contentType envelope") and the schema (@Schema(implementation = ContentTypeForm.class)). ContentTypeForm is exactly that envelope. So the textual claim contradicts what an AI client introspecting schema sees. Either the form needs @JsonValue / a custom serializer that exposes the raw type, or the description's "not wrapped" sentence is incorrect for code-gen consumers — they will wrap it.

8. Markdown tables in description strings are fragile. Several tables in ContentTypeResource and WorkflowResource are built by string concatenation; any future edit risks breaking a pipe alignment and downstream Markdown renderers (Stainless, Speakeasy, mintlify) handle malformed tables differently. Where each cell is small (3-4 rows), prefer a bulleted list — already used for clazz enumeration. Reserve tables for >5 rows. Also: half the tables in WorkflowResource use |-|-|-| separators, the other half use |-----|-----|-----| — inconsistent in the same file.
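
The separator inconsistency could be caught mechanically. The sketch below normalizes every table separator row to `|---|` cells; it is a hypothetical helper, not part of the dotCMS build, and it drops any alignment colons.

```python
import re

# Matches rows such as |-|-|-| or |-----|-----|-----|
SEPARATOR_ROW = re.compile(r"^\|(\s*:?-+:?\s*\|)+\s*$")


def normalize_separators(markdown: str) -> str:
    """Rewrite every table separator row with uniform '---' cells."""
    lines = []
    for line in markdown.split("\n"):
        if SEPARATOR_ROW.match(line):
            cells = line.strip().strip("|").split("|")
            line = "|" + "|".join("---" for _ in cells) + "|"
        lines.append(line)
    return "\n".join(lines)
```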

9. Non-ASCII characters (⚠️, arrows, and the yes/no table-cell markers). They look great in Swagger UI but:

  • The openapi.yaml regenerator already escapes correctly, but some MCP servers token-budget by characters, not by glyphs, and an emoji adds 1–4 bytes for one semantic warning.
  • For code-gen pipelines that strip non-ASCII (some still do), the yes/no markers collapse to nothing, and the acceptance table becomes ambiguous (blank cells in the ImmutableBinaryField row).
  • Consider [YES] / [NO] for table cells, WARNING: instead of ⚠️, -> instead of →. The semantics survive any client.
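
A substitution pass like the following would apply the suggestion. The warning sign and arrow mappings follow the text above; the specific yes/no glyphs (U+2705 and U+274C) are an assumption, since the review does not name them.

```python
REPLACEMENTS = [
    ("\u26a0\ufe0f", "WARNING:"),  # warning sign with emoji variation selector
    ("\u26a0", "WARNING:"),        # bare warning sign
    ("\u2192", "->"),              # rightwards arrow
    ("\u2705", "[YES]"),           # assumed "yes" glyph
    ("\u274c", "[NO]"),            # assumed "no" glyph
]


def asciify(description: str) -> str:
    """Replace the known glyphs with ASCII equivalents."""
    for glyph, replacement in REPLACEMENTS:
        description = description.replace(glyph, replacement)
    return description
```

The two-code-point warning sign is listed first so that the variation selector is not left behind as a stray character.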

10. New validation-error shape is described but not modeled. The example body for 400 ({ "errors": [{ "errorCode": "required", "fieldName": "image", "message": "…" }] }) only lives in prose under description. The @ApiResponse(responseCode = "400") has no @Content / @ExampleObject, so a tool calling responses['400'] finds nothing. Worth attaching the example to the actual 400 response so AI agents can find it via the standard OpenAPI path.
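
Once the example is reachable, an agent can act on it with very little code. The sketch below reads the error shape quoted above; the helper name and the message text are hypothetical.

```python
from typing import Any, Dict, List


def fields_with_error(body: Dict[str, Any], error_code: str) -> List[str]:
    """List the field names reported with a given errorCode."""
    return [
        err["fieldName"]
        for err in body.get("errors", [])
        if err.get("errorCode") == error_code and err.get("fieldName")
    ]


# Shape taken from the example above; the message text is hypothetical.
response_400 = {"errors": [
    {"errorCode": "required", "fieldName": "image", "message": "image is required"},
]}
```

Here `fields_with_error(response_400, "required")` tells the agent which fields to supply before retrying.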

11. Copy action ?identifier=<source-id> query param. Documented in description text but I don't see a matching @Parameter @QueryParam("identifier") on fireAction(String actionId, …). If the Java signature doesn't expose it, code-gen clients won't know about it — only the prose-reading agent will. Verify and add the @Parameter if missing.

12. Scope creep — MultiPartUtils.java null-check + new test is a real (unrelated) defensive bugfix shipping in a "docs only" PR. The change itself is fine (fileParts == null guard preserves the empty-list contract). But it makes the PR title (docs(api): …) and the closing reference (Closes #35381) misleading. Consider splitting into its own commit/PR, or amending the PR description to call it out (already partially noted in the rollback-safety comment, but not in the PR body itself).

Suggested follow-up issues (out of PR scope)

  • Convert prose enumerations (clazz, dataType, systemAction) into typed schemas — biggest AI win per line of work.
  • Add an x-dotcms-field-acceptance extension on the field schema so MCP agents can programmatically check whether a value type is allowed without parsing markdown tables.
  • De-duplicate the indexPolicy=WAIT_FOR paragraph into a single constant or externalDocs URL.
  • Type the validation-error response shape (ValidationErrorEntity { errorCode: enum, fieldName: string|null, message: string }) and reference it from every fire-action 400.

Verdict

Content quality is excellent — this captures the hard-earned MCP testing knowledge. The main lift to make it AI-optimal rather than just AI-friendly is structural: pull enums out of prose into schema.enum, model the response shapes that currently live as inline examples, and de-duplicate the recurring paragraphs. None of those block merging.


  • Read NavResource.java changes
  • Read ContentTypeResource.java changes
  • Read TempFileResource.java changes
  • Read WorkflowResource.java changes
  • Read openapi.yaml snippet for AI consumption check
  • Cross-check MultiPartUtils.java defensive fix vs. described docs contract
  • Compose AI-focused review feedback

rjvelazco added 2 commits May 12, 2026 11:50
…-on-mcp-testing-feedback' of https://github.com/dotCMS/core into issue-35381-improve-openapi-endpoint-descriptions-based-on-mcp-testing-feedback
Comment thread dotCMS/src/test/java/com/dotcms/rest/api/MultiPartUtilsTest.java
Comment thread dotCMS/src/main/java/com/dotcms/rest/api/v1/temp/TempFilesView.java Outdated
rjvelazco added 2 commits May 12, 2026 13:20
…acefully. Update MultiPartUtilsTest to improve coverage for multipart request scenarios, ensuring no NullPointerException is thrown when file parts are missing or empty.
…-on-mcp-testing-feedback' of https://github.com/dotCMS/core into issue-35381-improve-openapi-endpoint-descriptions-based-on-mcp-testing-feedback
@rjvelazco rjvelazco added this pull request to the merge queue May 12, 2026
Merged via the queue into main with commit 04ddc79 May 12, 2026
51 checks passed
@rjvelazco rjvelazco deleted the issue-35381-improve-openapi-endpoint-descriptions-based-on-mcp-testing-feedback branch May 12, 2026 20:27

Labels

AI: Safe To Rollback
Area : Backend (PR changes Java/Maven backend code)
Area : Documentation (PR changes documentation files)

Development

Successfully merging this pull request may close these issues.

Improve OpenAPI endpoint descriptions based on MCP testing feedback

3 participants